Grounding Complex Natural Language Commands for Temporal Tasks in Unseen Environments
Grounding navigational commands to linear temporal logic (LTL) leverages its
unambiguous semantics for reasoning about long-horizon tasks and verifying the
satisfaction of temporal constraints. Existing approaches require training data
from the specific environment and the landmarks that commands will refer to in
order to understand language in those environments. We propose Lang2LTL, a
modular system and a software package that leverages large language models
(LLMs) to ground temporal navigational commands to LTL specifications in
environments without prior language data. We comprehensively evaluate Lang2LTL
for five well-defined generalization behaviors. Lang2LTL demonstrates the
state-of-the-art ability of a single model to ground navigational commands to
diverse temporal specifications in 21 city-scaled environments. Finally, we
demonstrate that a physical robot using Lang2LTL can follow 52 semantically
diverse navigational commands in two indoor environments.
Comment: Conference on Robot Learning 202
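As a rough illustration of the modular grounding idea described above, the sketch below walks a command through three stages: extract landmark phrases, resolve them to propositional symbols, and translate the result into an LTL formula. Every function name, the symbol mapping, and the stubbed LLM behavior are hypothetical placeholders for this demo, not the Lang2LTL package's actual API.

```python
# Hypothetical sketch of a modular language-to-LTL grounding pipeline.
# The stages mirror the idea in the abstract, but every name and behavior
# here is an illustrative stub, not the Lang2LTL package API.

def extract_referring_expressions(command: str) -> list[str]:
    """Stub for an LLM call that lists landmark phrases in the command."""
    return ["the red door", "the kitchen"]  # canned output for the demo

def resolve_landmarks(expressions: list[str]) -> dict[str, str]:
    """Stub that maps each phrase to a proposition naming a known landmark
    in the target environment (a real system might use embeddings)."""
    return {"the red door": "a", "the kitchen": "b"}

def translate_to_ltl(command: str, symbols: dict[str, str]) -> str:
    """Stub for an LLM translation into an LTL formula over the symbols."""
    # "Visit X, then Y" grounds to a sequenced-visit template: F(a & F b)
    return "F ( {} & F {} )".format(symbols["the red door"],
                                    symbols["the kitchen"])

command = "Go to the red door, then go to the kitchen"
symbols = resolve_landmarks(extract_referring_expressions(command))
print(translate_to_ltl(command, symbols))  # F ( a & F b )
```

Because the translation step works over environment-agnostic symbols rather than environment-specific landmark names, a pipeline of this shape can in principle be reused in a new environment by swapping only the landmark-resolution stage.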
CAPE: Corrective Actions from Precondition Errors using Large Language Models
Extracting commonsense knowledge from a large language model (LLM) offers a
path to designing intelligent robots. Existing approaches that leverage LLMs
for planning are unable to recover when an action fails and often resort to
retrying failed actions, without resolving the error's underlying cause.
We introduce CAPE, a novel approach that proposes corrective actions to
resolve precondition errors during planning. CAPE improves the
quality of generated plans by leveraging few-shot reasoning from action
preconditions. Our approach enables embodied agents to execute more tasks than
baseline methods while ensuring semantic correctness and minimizing
re-prompting. In VirtualHome, CAPE generates executable plans while improving a
human-annotated plan correctness metric from 28.89% to 49.63% over SayCan. Our
improvements transfer to a Boston Dynamics Spot robot initialized with a set of
skills (specified in language) and associated preconditions, where CAPE
improves the correctness metric of the executed task plans by 76.49% compared
to SayCan. Our approach enables the robot to follow natural language commands
and robustly recover from failures that baseline approaches largely cannot
resolve or can only address inefficiently.
Comment: 8 pages, 3 figures, under review at ICRA 202
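As a loose illustration of the re-prompting idea (not CAPE's actual implementation), the sketch below retries a failed step only after asking an LLM for an action that resolves the reported precondition error. The agent, LLM, and error strings are toy stand-ins invented for the example.

```python
# Toy sketch of correcting precondition errors by re-prompting an LLM.
# Everything here (StubAgent, StubLLM, the error strings) is a hypothetical
# stand-in; CAPE's real skill and prompting interfaces are not shown.

class StubAgent:
    """Toy agent: 'walk through door' requires the door to be open."""
    def __init__(self):
        self.door_open = False

    def execute(self, action: str):
        if action == "open door":
            self.door_open = True
            return None  # success
        if action == "walk through door" and not self.door_open:
            return "precondition unmet: door is closed"
        return None  # success

class StubLLM:
    """Toy LLM: maps a precondition error to a corrective action."""
    def complete(self, prompt: str) -> str:
        return "open door" if "door is closed" in prompt else "noop"

def execute_with_corrections(agent, llm, plan, max_retries=3):
    for step in plan:
        for _ in range(max_retries):
            error = agent.execute(step)
            if error is None:
                break  # step succeeded
            # Re-prompt with the unmet precondition so the LLM can propose
            # a corrective action instead of blindly retrying the step.
            fix = llm.complete(f"Action '{step}' failed: {error}. "
                               "Suggest one action that fixes this.")
            agent.execute(fix)
        else:
            raise RuntimeError(f"could not satisfy preconditions for {step!r}")

execute_with_corrections(StubAgent(), StubLLM(), ["walk through door"])
```

The key difference from naive retrying is the feedback loop: the unmet precondition is fed back into the prompt, so the model proposes an action that changes the world state before the failed step is attempted again.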
2nd Workshop on Human-Interactive Robot Learning (HIRL)
With robots poised to enter our daily environments, they will not only need to work for people, but also learn from them. An active area of investigation in the robotics, machine learning, and human-robot interaction communities is the design of teachable robots that can learn interactively from human input. To refer to these research efforts, we use the umbrella term Human-Interactive Robot Learning (HIRL). While algorithmic solutions for robots learning from people have been investigated in a variety of ways, HIRL, as a fairly new research area, still lacks: 1) a formal set of definitions to classify related but distinct research problems or solutions, 2) benchmark tasks, interactions, and metrics to evaluate the performance of HIRL algorithms and interactions, and 3) clear long-term research challenges to be addressed by different communities. Last year we began consolidating the definitions and vocabulary needed to enable fruitful discussions between researchers from these interdisciplinary fields, and identified a preliminary list of long-, medium-, and short-term research problems for the community to tackle, as well as existing tools and frameworks that can be leveraged to this end. This workshop will build upon those discussions, focusing on promoting the specification and design of HIRL benchmarks.